Characterization of a big data storage workload in the cloud
The proliferation of big data processing platforms has led to radically different system designs, such as MapReduce and the newer Spark. Understanding the workloads of such systems facilitates tuning and could foster new designs. However, whereas MapReduce workloads have been characterized extensively, relatively little public knowledge exists about the characteristics of Spark workloads in representative environments. To address this problem, in this work we collect and analyze a 6-month Spark workload from a major provider of big data processing services, Databricks. Our analysis focuses on a number of key features, such as the long-term trends of reads and modifications, the statistical properties of reads, and the popularity of clusters and of file formats. Overall, we present numerous findings that could form the basis of new systems studies and designs. Our quantitative evidence and its analysis suggest the existence of daily and weekly load imbalances, of heavy-tailed and bursty behaviour, of the relative rarity of modifications, and of the proliferation of big-data-specific formats.
An Analysis of Distributed Systems Syllabi With a Focus on Performance-Related Topics
We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi
from top Computer Science programs, focusing on finding the prevalence and
context in which topics related to performance are being taught in these
courses. We also study the scale of the infrastructure mentioned in DS courses,
from small client-server systems to cloud-scale, peer-to-peer, global-scale
systems. We make eight main findings, covering goals such as performance, and
scalability and its variant elasticity; activities such as performance
benchmarking and monitoring; eight selected performance-enhancing techniques
(replication, caching, sharding, load balancing, scheduling, streaming,
migrating, and offloading); and control issues such as trade-offs that include
performance and performance variability.
Comment: Accepted for publication at WEPPE 2021, to be held in conjunction
with ACM/SPEC ICPE 2021: https://doi.org/10.1145/3447545.3451197. This article
is a follow-up of our prior ACM SIGCSE publication, arXiv:2012.0055
On Implementing Autonomic Systems with a Serverless Computing Approach: The Case of Self-Partitioning Cloud Caches
The research community has made significant advances towards realizing self-tuning cloud caches; notwithstanding, existing products still require manual expert tuning to maximize performance. Cloud (software) caches are built to swiftly serve requests; thus, avoiding costly functionality additions not directly related to the request-serving control path is critical. We show that serverless computing cloud services can be leveraged to solve the complex optimization problems that arise during self-tuning loops and can be used to optimize cloud caches for free. To illustrate that our approach is feasible and useful, we implement SPREDS (Self-Partitioning REDiS), a modified version of Redis that optimizes memory management in the multi-instance Redis scenario. A cost analysis shows that the serverless computing approach can lead to significant cost savings: the cost of running the controller as a serverless microservice is 0.85% of the cost of the always-on alternative. Through this case study, we make a strong case for implementing the controller of autonomic systems using a serverless computing approach.
Beyond Microbenchmarks: The SPEC-RG Vision for a Comprehensive Serverless Benchmark
Serverless computing services, such as Function-as-a-Service (FaaS), hold the attractive promise of a high level of abstraction and high performance, combined with the minimization of operational logic. Several large ecosystems of serverless platforms, both open- and closed-source, aim to realize this promise. Consequently, a lucrative market has emerged. However, the performance trade-offs of these systems are not well understood. Moreover, it is exactly the high level of abstraction and the opaqueness of the operational side that make performance evaluation studies of serverless platforms challenging. Learning from the history of IT platforms, we argue that a benchmark for serverless platforms could help address this challenge. We envision a comprehensive serverless benchmark, which we contrast with the narrow focus of prior work in this area. We argue that a comprehensive benchmark will need to take into account more than just runtime overhead, and include notions of cost, realistic workloads, more (open-source) platforms, and cloud integrations. Finally, we show through preliminary real-world experiments how such a benchmark can help compare the performance overhead when running a serverless workload on state-of-the-art platforms.
Quantifying cloud performance and dependability: Taxonomy, metric design, and emerging challenges
In only a decade, cloud computing has emerged from a pursuit for a service-driven information and communication technology (ICT), becoming a significant fraction of the ICT market. Responding to the growth of the market, many alternative cloud services and their underlying systems are currently vying for the attention of cloud users and providers. To make informed choices between competing cloud service providers, permit the cost-benefit analysis of cloud-based systems, and enable system DevOps to evaluate and tune the performance of these complex ecosystems, appropriate performance metrics, benchmarks, tools, and methodologies are necessary. This requires re-examining old system properties and considering new system properties, possibly leading to the re-design of classic benchmarking metrics such as expressing performance as throughput and latency (response time). In this work, we address these requirements by focusing on four system properties: (i) elasticity of the cloud service, to accommodate large variations in the amount of service requested, (ii) performance isolation between the tenants of shared cloud systems and resulting performance variability, (iii) availability of cloud services and systems, and (iv) the operational risk of running a production system in a cloud environment. Focusing on key metrics for each of these properties, we review the state-of-the-art, then select or propose new metrics together with measurement approaches. We see the presented metrics as a foundation toward upcoming industry-standard cloud benchmarks.
Carbon nanodots modified-electrode for peroxide-free cholesterol biosensing and biofuel cell design
The determination of cholesterol is greatly important because high concentrations of this biomarker are associated with heart disease. Moreover, cholesterol can be used as a fuel in enzymatic fuel cells operating under physiological conditions. Here, we present a cholesterol biosensor and a peroxide-free biofuel cell based on the electrocatalytic oxidation of the NADH generated during the enzymatic reaction of cholesterol dehydrogenase (ChDH) as an alternative to the H2O2 biosensing strategies used with cholesterol oxidase-bioelectrodes. Azure A functionalized-carbon nanodots were used as NADH oxidation electrocatalysts and for ChDH covalent immobilization. The biosensor responded linearly to cholesterol concentrations up to 1.7 mM with good sensitivity (4.50 mA cm−2 M−1) and at a low potential. The ChDH bioelectrode was combined with an O2-reducing bilirubin oxidase cathode to produce electrical energy using cholesterol as fuel and O2 as oxidant. Furthermore, the resulting enzymatic fuel cell was tested in human serum naturally containing free cholesterol.
A.L.DL. and M.P. thank MCIU/AEI/FEDER, EU for funding project RTI2018–095090-B-I00. M.B. acknowledges funding from the European Union’s Horizon 2020 Research and Innovation Program under the Marie Skłodowska-Curie grant agreement No. 713366. This work was also supported by Talent Attraction Project from CAM (SI3/PJI/2021–00341 and 2021–5A/BIO-20943), Spanish Ministerio de Ciencia e Innovacion (PID2020–116728RB-I00) and TRANSNANOAVANSENSCAM Program (S2018/NMT-4349
- …